Semi-supervised Learning with Ladder Networks

Antti Rasmus, Mathias Berglund, Mikko Honkala, Harri Valpola, Tapani Raiko

Neural Information Processing Systems

We combine supervised learning with unsupervised learning in deep neural networks. The proposed model is trained to simultaneously minimize the sum of supervised and unsupervised cost functions by backpropagation, avoiding the need for layer-wise pre-training. Our work builds on top of the Ladder network proposed by Valpola [1] which we extend by combining the model with supervision. We show that the resulting model reaches state-of-the-art performance in semi-supervised MNIST and CIFAR-10 classification in addition to permutation-invariant MNIST classification with all labels.
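The combined objective described in the abstract can be sketched as a cross-entropy supervised cost plus per-layer squared-error denoising costs. This is a minimal numpy sketch, not the paper's implementation; the weighting coefficients `lambdas` are hypothetical hyperparameters:

```python
import numpy as np

def supervised_cost(logits, label):
    # cross-entropy of softmax(logits) against an integer class label
    e = np.exp(logits - logits.max())
    p = e / e.sum()
    return -np.log(p[label])

def denoising_cost(z_clean, z_hat, lambdas):
    # sum over layers of lambda_l * mean squared error between the
    # clean-path activations and the decoder's denoised reconstructions
    return sum(lam * np.mean((zc - zh) ** 2)
               for lam, zc, zh in zip(lambdas, z_clean, z_hat))

def ladder_cost(logits, label, z_clean, z_hat, lambdas):
    # total cost minimized jointly by backpropagation,
    # with no layer-wise pre-training
    return supervised_cost(logits, label) + denoising_cost(z_clean, z_hat, lambdas)
```

In the semi-supervised setting, unlabeled examples contribute only the denoising term, while labeled examples contribute both.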


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

Submitted by Assigned_Reviewer_1: This paper proposes to apply a recent method for deep unsupervised learning, the ladder neural network, to supervised learning tasks by combining the original objectives with an additional supervised objective applied at the top of the ladder network. The ladder network idea consists of learning as many denoising autoencoding criteria as there are layers in the network, where the denoising uses the representation at the given layer and in the next layer. The method is simple and straightforward, and can be depicted graphically as a neural network (as is done in Figure 1). Particular attention is dedicated to the choice of the denoising architecture, where the multiplicative interactions between the lateral and top-down connections are made explicit in the model. However, the authors show that the choice of denoising model is not crucial, and good results can also be obtained with a variety of denoising models.
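As a rough illustration of the multiplicative lateral/top-down interaction the review mentions, the vanilla ladder combinator computes z_hat = (z_tilde - mu(u)) * v(u) + mu(u), where mu and v are sigmoid-affine functions of the top-down signal u. The sketch below uses plain scalar parameters for clarity; in the actual model they are learned per unit:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def combinator(z_tilde, u, params):
    # z_tilde: noisy lateral activation; u: top-down signal.
    # a1..a5 parametrize mu(u); b1..b5 parametrize v(u).
    a1, a2, a3, a4, a5, b1, b2, b3, b4, b5 = params
    mu = a1 * sigmoid(a2 * u + a3) + a4 * u + a5
    v = b1 * sigmoid(b2 * u + b3) + b4 * u + b5
    # multiplicative gating of the lateral path by the top-down path
    return (z_tilde - mu) * v + mu
```

With mu(u) = 0 and v(u) = 1 the combinator degenerates to an identity on the lateral path, which is the sense in which the denoising architecture can be varied without breaking the overall model.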


Reviews: Domain Separation Networks

Neural Information Processing Systems

I like the idea of domain separation even though it is not new. However, I do not think the current draft sufficiently validates the proposed approach. First of all, neither the experimental settings nor the datasets used are standard. To allow a direct comparison with other methods (e.g., [7], [29], [17], [26]), standard settings and benchmark datasets should be used. In the current draft, it is not clear whether the better performance is due to better hyper-parameter tuning on the validation set or to the proposed model. Moreover, since the baseline methods in the original papers are NOT tested on these datasets, it may be that, with proper tuning of parameters on the validation set (which is not available in the standard setting), the baseline methods would achieve better performance.


Recurrent Ladder Networks

Isabeau Prémont-Schwarz, Alexander Ilin, Tele Hao, Antti Rasmus, Rinu Boney, Harri Valpola

Neural Information Processing Systems

We propose a recurrent extension of the Ladder networks [22] whose structure is motivated by the inference required in hierarchical latent variable models. We demonstrate that the recurrent Ladder is able to handle a wide variety of complex learning tasks that benefit from iterative inference and temporal modeling. The architecture shows close-to-optimal results on temporal modeling of video data, competitive results on music modeling, and improved perceptual grouping based on higher order abstractions, such as stochastic textures and motion cues.


Convolutional Ladder Networks for Legal NERC and the Impact of Unsupervised Data in Better Generalizations

Cardellino, Cristian (National University of Córdoba) | Alemany, Laura Alonso (National University of Córdoba) | Teruel, Milagro (National University of Córdoba) | Villata, Serena (Université Côte d'Azur) | Marro, Santiago (National University of Córdoba)

AAAI Conferences

In this paper we adapt the semi-supervised deep learning architecture known as Convolutional Ladder Networks from the domain of computer vision and explore how well it works for a semi-supervised Named Entity Recognition and Classification task with legal data. The idea of exploring a semi-supervised technique is to assess the impact of large amounts of unsupervised data (cheap to obtain) on specific tasks that have little annotated data, in order to develop robust models that are less prone to overfitting. To achieve this, we must first check the impact on a task that is easier to measure. We present preliminary results; however, the experiments carried out show some very interesting insights that foster further research on the topic.


Generative Models For Deep Learning with Very Scarce Data

Maroñas, Juan, Paredes, Roberto, Ramos, Daniel

arXiv.org Machine Learning

The goal of this paper is to deal with a data-scarcity scenario in which deep learning techniques tend to fail. We compare the use of two well-established techniques, Restricted Boltzmann Machines and Variational Auto-encoders, as generative models for enlarging the training set in a classification framework. Essentially, we rely on Markov Chain Monte Carlo (MCMC) algorithms to generate new samples. We show that this methodology improves generalization compared to other state-of-the-art techniques, e.g. semi-supervised learning with ladder networks. Furthermore, we show that RBMs are better than VAEs at generating new samples for training a classifier with good generalization capabilities.
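A minimal sketch of the MCMC step this kind of augmentation relies on, assuming a Bernoulli RBM with weights `W` and biases `b_v`, `b_h`; the `augment` helper and all names here are illustrative, not taken from the paper:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_sample(W, b_v, b_h, v0, n_steps, rng):
    # alternating Gibbs sampling v -> h -> v in a Bernoulli RBM;
    # after enough steps the chain yields approximate model samples
    v = v0
    for _ in range(n_steps):
        p_h = sigmoid(v @ W + b_h)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        p_v = sigmoid(h @ W.T + b_v)
        v = (rng.random(p_v.shape) < p_v).astype(float)
    return v

def augment(X, y, W, b_v, b_h, n_new, label, n_steps=50, seed=0):
    # hypothetical helper: enlarge the training set with n_new
    # MCMC-generated samples, all tagged with a fixed label
    rng = np.random.default_rng(seed)
    v0 = (rng.random((n_new, X.shape[1])) < 0.5).astype(float)
    X_new = gibbs_sample(W, b_v, b_h, v0, n_steps, rng)
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, label)])
```

In practice one trained RBM per class (or a conditional model) would be needed so that the generated samples can be labeled; the classifier is then retrained on the enlarged set.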